perf: update gumbo utf8 decode #2735

flavorjones · 2022-12-20T16:41:57Z

What problem is this PR intended to solve?

Related to #2722

Update gumbo's UTF-8 decode() to the version at https://bjoern.hoehrmann.de/utf-8/decoder/dfa/, which apparently saves a shift instruction for every byte. Benchmarking doesn't show a discernible performance improvement, though.

Have you included adequate test coverage?

N/A

Does this change affect the behavior of either the C or the Java implementations?

N/A

stevecheckoway

I don't think this should be merged. This moves from the slightly more optimized version to the slightly less optimized version.

stevecheckoway · 2022-12-20T18:57:20Z

gumbo-parser/src/utf8.c

-static inline uint32_t decode(uint32_t* state, uint32_t* codep, uint32_t byte) {
+uint32_t static inline
+decode(uint32_t* state, uint32_t* codep, uint32_t byte) {


It seems odd to put static inline between the return type and the function name. I don't think any of the other functions do that.

stevecheckoway · 2022-12-20T19:02:46Z

gumbo-parser/src/utf8.c

-  *codep =
-    (*state != UTF8_ACCEPT)
-      ? (byte & 0x3fu) | (*codep << 6)
-      : (0xff >> type) & (byte);
+  *codep = (*state != UTF8_ACCEPT) ?
+    (byte & 0x3fu) | (*codep << 6) :
+    (0xff >> type) & (byte);

-  *state = utf8d[256 + *state + type];
+  *state = utf8d[256 + *state*16 + type];
  return *state;


I believe this is the older, slower code. The *16 is the shift that Rich Felker eliminated by pre-multiplying the state values in the transition table. From the decoder page:

On 24th June 2010 Rich Felker pointed out that the state values in the transition table can be pre-multiplied with 16 which would save a shift instruction for every byte. D'oh! We actually just need 12 and can throw away the filler values previously in the table making the table 36 bytes shorter and save the shift in the code.
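To make the pre-multiplication point concrete, here is a minimal sketch using a toy three-state automaton with four character classes. The tables and the names `plain`, `scaled`, `step_plain`, `step_scaled`, and `variants_agree` are hypothetical and are not gumbo's or Hoehrmann's actual UTF-8 table; the sketch only shows why storing states already multiplied by the row width removes the multiply (or shift) from the lookup.

```c
#include <stddef.h>
#include <stdint.h>

enum { NUM_STATES = 3, NUM_CLASSES = 4 };

/* Toy transition table (hypothetical, NOT the real UTF-8 table).
   Rows are states 0..2, columns are character classes 0..3. */
static const uint8_t plain[NUM_STATES * NUM_CLASSES] = {
  0, 1, 2, 2,
  0, 2, 2, 1,
  2, 2, 2, 2,
};

/* Same automaton, but every stored state is pre-multiplied by NUM_CLASSES,
   so a state value is already the offset of its row. */
static uint8_t scaled[NUM_STATES * NUM_CLASSES];

static void build_scaled(void) {
  for (size_t i = 0; i < NUM_STATES * NUM_CLASSES; i++) {
    scaled[i] = (uint8_t)(plain[i] * NUM_CLASSES);
  }
}

/* Older style: the row offset is computed on every step (state * width). */
static uint32_t step_plain(uint32_t state, uint32_t cls) {
  return plain[state * NUM_CLASSES + cls];
}

/* Pre-multiplied style: the state is used directly as the row offset. */
static uint32_t step_scaled(uint32_t state, uint32_t cls) {
  return scaled[state + cls];
}

/* Returns 1 if both variants visit corresponding states for every input
   class, 0 otherwise. Call build_scaled() before using this. */
static int variants_agree(const uint8_t* classes, size_t n) {
  uint32_t p = 0; /* state in the plain automaton */
  uint32_t s = 0; /* state in the scaled automaton (always p * NUM_CLASSES) */
  for (size_t i = 0; i < n; i++) {
    p = step_plain(p, classes[i]);
    s = step_scaled(s, classes[i]);
    if (s != p * NUM_CLASSES) return 0;
  }
  return 1;
}
```

In this toy, `step_plain` corresponds to the `utf8d[256 + *state*16 + type]` form in the diff, and `step_scaled` corresponds to the `utf8d[256 + *state + type]` form that the existing code already uses.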

Ah, I see, that's confusing. OK, will close.

update gumbo's utf8 decode function with latest

45b2d11

from https://bjoern.hoehrmann.de/utf-8/decoder/dfa/ which apparently saves a shift instruction for every byte. benchmarking doesn't find a discernible performance improvement, though.

flavorjones requested a review from stevecheckoway December 20, 2022 16:41

flavorjones mentioned this pull request Dec 20, 2022

explore further optimizing the HTML5 parser and serializer #2722

Open

flavorjones added topic/HTML5 topic/performance labels Dec 20, 2022

stevecheckoway reviewed Dec 20, 2022

flavorjones closed this Dec 20, 2022

flavorjones deleted the flavorjones-update-gumbo-utf8-decode branch December 21, 2022 21:37
